thermal image
MrGS: Multi-modal Radiance Fields with 3D Gaussian Splatting for RGB-Thermal Novel View Synthesis
Kweon, Minseong, Kim, Janghyun, Shin, Ukcheol, Park, Jinsun
Recent advances in Neural Radiance Fields (NeRFs) and 3D Gaussian Splatting (3DGS) have achieved considerable performance in RGB scene reconstruction. However, multi-modal rendering that incorporates thermal infrared imagery remains largely underexplored. Existing approaches tend to neglect distinctive thermal characteristics, such as heat conduction and the Lambertian property. In this study, we introduce MrGS, a multi-modal radiance field based on 3DGS that simultaneously reconstructs both RGB and thermal 3D scenes. Specifically, MrGS derives RGB- and thermal-related information from a single appearance feature through orthogonal feature extraction and employs view-dependent or view-independent embedding strategies depending on the degree of Lambertian reflectance exhibited by each modality. Furthermore, we leverage two physics-based principles to effectively model thermal-domain phenomena. First, we integrate Fourier's law of heat conduction prior to alpha blending to model intensity interpolation caused by thermal conduction between neighboring Gaussians. Second, we apply the Stefan-Boltzmann law and the inverse-square law to formulate a depth-aware thermal radiation map that imposes additional geometric constraints on thermal rendering. Experimental results demonstrate that the proposed MrGS achieves high-fidelity RGB-T scene reconstruction while reducing the number of Gaussians.
- North America > United States > Minnesota (0.04)
- Asia > South Korea > Busan > Busan (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
- Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
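The MrGS abstract above names the Stefan-Boltzmann law and the inverse-square law as the basis for its depth-aware thermal radiation map. The following is a minimal sketch of those two textbook relations only, not the paper's formulation. The function names, the gray-body emissivity parameter, and the isotropic point-source assumption are illustrative choices that do not come from the abstract.

```python
import math

# Stefan-Boltzmann constant in W m^-2 K^-4 (CODATA value).
STEFAN_BOLTZMANN = 5.670374419e-8


def radiant_exitance(temperature_k, emissivity=1.0):
    """Power radiated per unit area by a gray body: M = eps * sigma * T^4."""
    if temperature_k < 0:
        raise ValueError("temperature must be in kelvin and non-negative")
    return emissivity * STEFAN_BOLTZMANN * temperature_k ** 4


def irradiance_at_distance(power_w, distance_m):
    """Inverse-square falloff for an isotropic point source: E = P / (4 pi d^2)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return power_w / (4.0 * math.pi * distance_m ** 2)


# A surface at 300 K radiates roughly 459 W per square metre as a black body.
exitance_300k = radiant_exitance(300.0)

# Doubling the distance to a source quarters the received irradiance.
falloff_ratio = irradiance_at_distance(100.0, 2.0) / irradiance_at_distance(100.0, 1.0)
```

Combining the two gives a received signal that grows with the fourth power of surface temperature and shrinks with the square of depth, which is presumably why depth can act as a geometric constraint on thermal rendering.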
Detecting spills using thermal imaging, pretrained deep learning models, and a robotic platform
Yeghiyan, Gregory, Azar, Jurius, Butani, Devson, Chung, Chan-Jin
This paper presents a real-time spill detection system that utilizes pretrained deep learning models with RGB and thermal imaging to classify spill vs. no-spill scenarios across varied environments. Using a balanced binary dataset (4,000 images), our experiments demonstrate the advantages of thermal imaging in inference speed, accuracy, and model size. We achieve up to 100% accuracy using lightweight models like VGG19 and NasNetMobile, with thermal models performing faster and more robustly across different lighting conditions. Our system runs on consumer-grade hardware (RTX 4080) and achieves inference times as low as 44 ms with model sizes under 350 MB, highlighting its deployability in safety-critical contexts. Results from experiments with a real robot and test datasets indicate that a VGG19 model trained on thermal imaging performs best.
- North America > United States > Michigan > Oakland County > Southfield (0.05)
- North America > United States > Michigan > Wayne County > Livonia (0.04)
See the past: Time-Reversed Scene Reconstruction from Thermal Traces Using Visual Language Models
Contreras, Kebin, Toscano-Palomino, Luis, Mura, Mauro Dalla, Bacca, Jorge
Recovering the past from present observations is an intriguing challenge with potential applications in forensics and scene analysis. Thermal imaging, operating in the infrared range, provides access to otherwise invisible information. Since humans are typically warmer (37 °C, 98.6 °F) than their surroundings, interactions such as sitting, touching, or leaning leave residual heat traces. These fading imprints serve as passive temporal codes, allowing for the inference of recent events that exceed the capabilities of RGB cameras. This work proposes a time-reversed reconstruction framework that uses paired RGB and thermal images to recover scene states from a few seconds earlier. The proposed approach couples Visual-Language Models (VLMs) with a constrained diffusion process, where one VLM generates scene descriptions and another guides image reconstruction, ensuring semantic and structural consistency. The method is evaluated in three controlled scenarios, demonstrating the feasibility of reconstructing plausible past frames up to 120 seconds earlier, providing a first step toward time-reversed imaging from thermal traces.
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- South America > Colombia > Santander Department (0.04)
- Asia > Middle East > Iran (0.04)
ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation
Xiao, Jiuhong, Nayak, Roshan, Zhang, Ning, Tortei, Daniel, Loianno, Giuseppe
Paired RGB-thermal data is crucial for visual-thermal sensor fusion and cross-modality tasks, including important applications such as multi-modal image alignment and retrieval. However, the scarcity of synchronized and calibrated RGB-thermal image pairs presents a major obstacle to progress in these areas. To overcome this challenge, RGB-to-Thermal (RGB-T) image translation has emerged as a promising solution, enabling the synthesis of thermal images from abundant RGB datasets for training purposes. In this study, we propose ThermalGen, an adaptive flow-based generative model for RGB-T image translation, incorporating an RGB image conditioning architecture and a style-disentangled mechanism. To support large-scale training, we curated eight public satellite-aerial, aerial, and ground RGB-T paired datasets, and introduced three new large-scale satellite-aerial RGB-T datasets (DJI-day, Bosonplus-day, and Bosonplus-night) captured across diverse times, sensor types, and geographic regions. Extensive evaluations across multiple RGB-T benchmarks demonstrate that ThermalGen achieves comparable or superior translation performance compared to existing GAN-based and diffusion-based methods. To our knowledge, ThermalGen is the first RGB-T image translation model capable of synthesizing thermal images that reflect significant variations in viewpoints, sensor characteristics, and environmental conditions. Project page: http://xjh19971.github.io/ThermalGen
- North America > United States > New York (0.40)
- Europe > Germany > Baden-Württemberg > Freiburg (0.05)
- North America > United States > California > Alameda County > Berkeley (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.87)
HOTSPOT-YOLO: A Lightweight Deep Learning Attention-Driven Model for Detecting Thermal Anomalies in Drone-Based Solar Photovoltaic Inspections
Thermal anomaly detection in solar photovoltaic (PV) systems is essential for ensuring operational efficiency and reducing maintenance costs. In this study, we developed and named HOTSPOT-YOLO, a lightweight artificial intelligence (AI) model that integrates an efficient convolutional neural network backbone and attention mechanisms to improve object detection. This model is specifically designed for drone-based thermal inspections of PV systems, addressing the unique challenges of detecting small and subtle thermal anomalies, such as hotspots and defective modules, while maintaining real-time performance. Experimental results demonstrate a mean average precision of 90.8%, reflecting a significant improvement over baseline object detection models. With a reduced computational load and robustness under diverse environmental conditions, HOTSPOT-YOLO offers a scalable and reliable solution for large-scale PV inspections. This work highlights the integration of advanced AI techniques with practical engineering applications, revolutionizing automated fault detection in renewable energy systems.
- Europe > Denmark (0.14)
- Asia > Nepal (0.04)
- Asia > Middle East > Republic of Türkiye (0.04)
- Asia > India (0.04)
- Overview (0.68)
- Research Report > New Finding (0.54)
AnyTSR: Any-Scale Thermal Super-Resolution for UAV
Li, Mengyuan, Fu, Changhong, Lu, Ziyu, Zhang, Zijie, Zuo, Haobo, Yao, Liangliang
Thermal imaging can greatly enhance the application of intelligent unmanned aerial vehicles (UAV) in challenging environments. However, the inherent low resolution of thermal sensors leads to insufficient details and blurred boundaries. Super-resolution (SR) offers a promising solution to address this issue, but most existing SR methods are designed for fixed-scale SR. They are computationally expensive and inflexible in practical applications. To address the above issues, this work proposes a novel any-scale thermal SR method (AnyTSR) for UAV within a single model. Specifically, a new image encoder is proposed to explicitly assign specific feature codes to enable more accurate and flexible representation. Additionally, by effectively embedding coordinate offset information into the local feature ensemble, an innovative any-scale upsampler is proposed to better understand spatial relationships and reduce artifacts. Moreover, a novel dataset (UAV-TSR), covering both land and water scenes, is constructed for thermal SR tasks. Experimental results demonstrate that the proposed method consistently outperforms state-of-the-art methods across all scaling factors and generates more accurate and detailed high-resolution images.
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
- Information Technology > Sensing and Signal Processing > Image Processing (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
ThermalDiffusion: Visual-to-Thermal Image-to-Image Translation for Autonomous Navigation
Bansal, Shruti, Wang, Wenshan, Liu, Yifei, Maheshwari, Parv
Autonomous systems rely on sensors to estimate the environment around them. However, cameras, LiDARs, and RADARs have their own limitations. In nighttime or degraded environments such as fog, mist, or dust, thermal cameras can provide valuable information regarding the presence of objects of interest due to their heat signature. They make it easy to identify humans and vehicles that are usually at higher temperatures compared to their surroundings. In this paper, we focus on the adaptation of thermal cameras for robotics and automation, where the biggest hurdle is the lack of data. Several multi-modal datasets are available for driving robotics research in tasks such as scene segmentation, object detection, and depth estimation, which are the cornerstone of autonomous systems. However, they are found to be lacking in thermal imagery. Our paper proposes a solution to augment these datasets with synthetic thermal data to enable widespread and rapid adaptation of thermal cameras. We explore the use of conditional diffusion models to convert existing RGB images to thermal images using self-attention to learn the thermal properties of real-world objects.
- Europe > Germany > Baden-Württemberg > Freiburg (0.08)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Europe > Switzerland (0.04)
A Comprehensive Dataset for Underground Miner Detection in Diverse Scenario
Addy, Cyrus, Gurumadaiah, Ajay Kumar, Gao, Yixiang, Awuah-Offei, Kwame
Underground mining operations face significant safety challenges that make emergency response capabilities crucial. While robots have shown promise in assisting with search and rescue operations, their effectiveness depends on reliable miner detection capabilities. Deep learning algorithms offer potential solutions for automated miner detection, but require comprehensive training datasets, which are currently lacking for underground mining environments. This paper presents a novel thermal imaging dataset specifically designed to enable the development and validation of miner detection systems for potential emergency applications. We systematically captured thermal imagery of various mining activities and scenarios to create a robust foundation for detection algorithms. To establish baseline performance metrics, we evaluated several state-of-the-art object detection algorithms including YOLOv8, YOLOv10, YOLO11, and RT-DETR on our dataset. While not exhaustive of all possible emergency situations, this dataset serves as a crucial first step toward developing reliable thermal-based miner detection systems that could eventually be deployed in real emergency scenarios. This work demonstrates the feasibility of using thermal imaging for miner detection and establishes a foundation for future research in this critical safety application.
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Missouri > Phelps County > Rolla (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
A computer vision-based model for occupancy detection using low-resolution thermal images
Cui, Xue, Zakka, Vincent Gbouna, Lee, Minhyun
Occupancy plays an essential role in influencing the energy consumption and operation of heating, ventilation, and air conditioning (HVAC) systems. Traditional HVAC systems typically operate on fixed schedules without considering occupancy. Advanced occupant-centric control (OCC) uses occupancy status to regulate HVAC operations. RGB images combined with computer vision (CV) techniques are widely used for occupancy detection; however, the detailed facial and body features they capture raise significant privacy concerns. Low-resolution thermal images offer a non-invasive solution that mitigates privacy issues. The study developed an occupancy detection model utilizing low-resolution thermal images and CV techniques, where transfer learning was applied to fine-tune the You Only Look Once version 5 (YOLOv5) model. The developed model ultimately achieved satisfactory performance, with precision, recall, mAP50, and mAP50 values approaching 1.000. The contributions of this model lie not only in mitigating privacy concerns but also in reducing computing resource demands.
- Asia > China > Hong Kong (0.05)
- Europe > United Kingdom (0.04)
- Europe > Netherlands > North Brabant > Eindhoven (0.04)
- Construction & Engineering > HVAC (1.00)
- Information Technology > Security & Privacy (0.96)
- Energy (0.89)
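The occupancy-detection abstract above reports precision and recall approaching 1.000. As a reminder of what those two figures measure, here is a minimal sketch of the standard definitions computed from detection counts. The function name and the example counts are hypothetical and are not taken from the paper; mAP is deliberately left out because it additionally requires ranking detections by confidence and matching them at an IoU threshold.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts.

    precision = TP / (TP + FP): share of reported detections that are correct.
    recall    = TP / (TP + FN): share of actual occupants that were detected.
    A ratio with an empty denominator is reported as 0.0.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 8 correct detections, 2 spurious ones, no misses.
example_precision, example_recall = precision_recall(8, 2, 0)
```

Values near 1.000 for both mean the detector produced almost no spurious boxes and missed almost no occupants on the evaluated data.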
MonoTher-Depth: Enhancing Thermal Depth Estimation via Confidence-Aware Distillation
Zuo, Xingxing, Ranganathan, Nikhil, Lee, Connor, Gkioxari, Georgia, Chung, Soon-Jo
Monocular depth estimation (MDE) from thermal images is a crucial technology for robotic systems operating in challenging conditions such as fog, smoke, and low light. The limited availability of labeled thermal data constrains the generalization capabilities of thermal MDE models compared to foundational RGB MDE models, which benefit from datasets of millions of images across diverse scenarios. To address this challenge, we introduce a novel pipeline that enhances thermal MDE through knowledge distillation from a versatile RGB MDE model. Our approach features a confidence-aware distillation method that utilizes the predicted confidence of the RGB MDE to selectively strengthen the thermal MDE model, capitalizing on the strengths of the RGB model while mitigating its weaknesses. Our method significantly improves the accuracy of the thermal MDE, independent of the availability of labeled depth supervision, and greatly expands its applicability to new scenarios. In our experiments on new scenarios without labeled depth, the proposed confidence-aware distillation method reduces the absolute relative error of thermal MDE by 22.88% compared to the baseline without distillation.
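The MonoTher-Depth abstract reports its gain as a reduction in absolute relative error. The sketch below shows the definition of that metric as commonly used in depth estimation, the mean of |predicted - true| / true over valid pixels, along with how a relative reduction between two error values is computed. The function names and sample numbers are hypothetical, and the paper's exact evaluation protocol (valid-pixel masking, depth caps) may differ.

```python
def abs_rel_error(predicted_depths, true_depths):
    """Mean of |pred - gt| / gt over pixels whose ground-truth depth is positive."""
    valid = [(p, g) for p, g in zip(predicted_depths, true_depths) if g > 0]
    if not valid:
        raise ValueError("no pixels with positive ground-truth depth")
    return sum(abs(p - g) / g for p, g in valid) / len(valid)


def relative_reduction(baseline_error, improved_error):
    """Fraction by which the error dropped relative to the baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical two-pixel example: errors of 1/1 and 1/5 average to 0.6.
sample_abs_rel = abs_rel_error([2.0, 4.0], [1.0, 5.0])

# Hypothetical errors: going from 0.20 to 0.15 is a 25% relative reduction.
sample_reduction = relative_reduction(0.20, 0.15)
```

Under this reading, a 22.88% reduction means the distilled model's absolute relative error is about 0.77 times the baseline's, whatever the baseline value is.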